Revert VLM support in parse_response #5561
Conversation
I'm not sure about this PR, as it seems to overlap quite a bit with what I'm currently addressing in #5489. The approach I'm taking there is intentionally incremental:
- First, ensure that we can consistently pass only tokenizer instances to `parse_response` (by introducing `self._tokenizer` across trainers).
- Then, in a follow-up step, simplify `parse_response` to only accept tokenizers: Make parse_response accept only tokenizer
Given that, this PR feels somewhat like duplicated effort. Would it make sense to wait for #5489 to land instead?
For context, I’m already working through the relevant discussion here: #5489 (comment)
`parse_response` only needs a tokenizer instance, but it had to handle both because we did not have a simple way to pass only the tokenizer. Once we implement `self._tokenizer` in all trainers, `parse_response` could be simplified to accept only tokenizer instances.
and here: #5489 (comment)
More broadly, the underlying goal of this PR is to centralize the processor/tokenizer disambiguation within `processing_class` in a single place, so that the rest of the code can rely on a well-defined and consistent interface, with a clear expected class instance.
In that sense, the current change in calling `parse_response` is an intermediate step toward that simplification, rather than a deviation from it.
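The centralized disambiguation described above can be sketched roughly as follows. This is an illustrative sketch only, not TRL's actual implementation: the helper name `resolve_tokenizer` and the stand-in classes are hypothetical, standing in for `ProcessorMixin` and `PreTrainedTokenizerBase`.

```python
# Hypothetical sketch: resolve the tokenizer from processing_class ONCE,
# in the trainer's setup, instead of re-disambiguating inside parse_response.

class FakeTokenizer:
    """Stand-in for a PreTrainedTokenizerBase instance."""

class FakeProcessor:
    """Stand-in for a VLM ProcessorMixin, which wraps an inner tokenizer."""
    def __init__(self):
        self.tokenizer = FakeTokenizer()

def resolve_tokenizer(processing_class):
    # A VLM processor exposes its inner tokenizer as `.tokenizer`;
    # a plain tokenizer is returned unchanged.
    return getattr(processing_class, "tokenizer", processing_class)

# A trainer would then store this once, e.g.:
#   self._tokenizer = resolve_tokenizer(self.processing_class)
# and downstream code (like parse_response) only ever sees a tokenizer.
tok = resolve_tokenizer(FakeProcessor())
print(type(tok).__name__)  # -> FakeTokenizer
```

With this pattern, `parse_response` no longer needs to know that processors exist at all.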
Ah yes, ok. LGTM @albertvillanova
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 973cf25.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fffa7a79d3

`parse_response` previously accepted either a tokenizer or a processor (from #5323) and unwrapped the inner tokenizer on the fly. Now that call sites can easily pass the tokenizer directly, we move that disambiguation to the call sites and keep `parse_response` strictly tokenizer-only. This centralizes the "processor vs tokenizer" logic in one place per trainer and makes `parse_response`'s contract simpler.

Note

Medium Risk

This is a small but potentially breaking API change: callers that previously passed a VLM `ProcessorMixin` to `parse_response` must now pass `processor.tokenizer`, which could affect downstream integrations.

Overview

Simplifies `parse_response`'s contract to accept only a `PreTrainedTokenizerBase` (removing the internal "processor vs tokenizer" unwrapping logic) and updates its docstring accordingly.

Adjusts `TestParseResponse` to explicitly pass `processing_class.tokenizer` for VLM processors, keeping parsing behavior the same while moving the disambiguation to call sites.

Reviewed by Cursor Bugbot for commit fb245d4.
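The migration for affected callers can be illustrated with a minimal sketch. The `parse_response` body and the dummy classes here are hypothetical stand-ins (the real function's signature and behavior live in TRL); the point is only the call-site change: unwrap the tokenizer yourself instead of passing the processor.

```python
# Illustrative sketch of the breaking change; names and bodies are stand-ins.

def parse_response(response_ids, tokenizer):
    # New contract: strictly tokenizer-only -- no ProcessorMixin unwrapping here.
    return tokenizer.decode(response_ids)

class DummyTokenizer:
    """Stand-in tokenizer with a decode method."""
    def decode(self, ids):
        return " ".join(str(i) for i in ids)

class DummyProcessor:
    """Stand-in VLM processor wrapping a tokenizer."""
    def __init__(self):
        self.tokenizer = DummyTokenizer()

processor = DummyProcessor()
# Before: parse_response(ids, processor) unwrapped the tokenizer internally.
# After: the caller disambiguates and passes the inner tokenizer explicitly.
print(parse_response([1, 2, 3], processor.tokenizer))  # -> 1 2 3
```

Callers that already pass a plain tokenizer need no change; only VLM call sites that passed a processor must switch to `processor.tokenizer`.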